    Size and depth of monotone neural networks: interpolation and approximation

    Monotone functions and data sets arise in a variety of applications. We study the interpolation problem for monotone data sets: the input is a monotone data set with $n$ points, and the goal is to find a size- and depth-efficient monotone neural network, with non-negative parameters and threshold units, that interpolates the data set. We show that there are monotone data sets that cannot be interpolated by a monotone network of depth $2$. On the other hand, we prove that for every monotone data set with $n$ points in $\mathbb{R}^d$, there exists an interpolating monotone network of depth $4$ and size $O(nd)$. Our interpolation result implies that every monotone function over $[0,1]^d$ can be approximated arbitrarily well by a depth-$4$ monotone network, improving on the previous best-known construction of depth $d+1$. Finally, building on results from Boolean circuit complexity, we show that the inductive bias of having positive parameters can lead to a super-polynomial blow-up in the number of neurons when approximating monotone functions.
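
    To make the model concrete, here is a minimal sketch (not the paper's depth-4 construction) of the network class the abstract refers to: a feedforward network of threshold units whose weights are constrained to be non-negative, which forces the computed function to be coordinate-wise monotone. The specific architecture, weights, and empirical check below are illustrative assumptions.

```python
import numpy as np

def threshold(z):
    """Heaviside threshold unit: outputs 1 where z >= 0, else 0."""
    return (z >= 0).astype(float)

class MonotoneThresholdNet:
    """Feedforward net with threshold units and non-negative weights
    (biases may be arbitrary). Each layer is coordinate-wise monotone,
    so the whole network is as well."""

    def __init__(self, weights, biases):
        assert all((W >= 0).all() for W in weights), "weights must be non-negative"
        self.weights, self.biases = weights, biases

    def __call__(self, x):
        h = np.atleast_2d(x)
        for W, b in zip(self.weights, self.biases):
            h = threshold(h @ W + b)
        return h

# A depth-2 example on R^2 that fires iff x1 + x2 >= 1.
net = MonotoneThresholdNet(
    weights=[np.array([[1.0], [1.0]]), np.array([[1.0]])],
    biases=[np.array([-1.0]), np.array([-1.0])],
)

# Empirical monotonicity check: x <= y coordinate-wise implies f(x) <= f(y).
rng = np.random.default_rng(0)
xs = rng.random((200, 2))
ys = xs + rng.random((200, 2))  # ys dominates xs coordinate-wise
assert (net(xs) <= net(ys)).all()
print(net(np.array([[0.2, 0.3], [0.7, 0.6]])).ravel())  # [0. 1.]
```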

    Community detection and percolation of information in a geometric setting

    We take the first steps towards generalizing the theory of stochastic block models, in the sparse regime, to a model where the discrete community structure is replaced by an underlying geometry. We consider a geometric random graph over a homogeneous metric space where the probability that two vertices are connected is an arbitrary function of their distance. We give sufficient conditions under which the locations can be recovered (up to an isomorphism of the space) in the sparse regime. Moreover, we define a geometric counterpart of the model of flow of information on trees, due to Mossel and Peres, in which one considers a branching random walk on a sphere and the goal is to recover the location of the root based on the locations of the leaves. We give sufficient conditions for percolation and for non-percolation of information in this model.
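
    As an illustration of the graph model only (recovering the locations is the paper's subject and is not attempted here), the sketch below samples a sparse geometric random graph on the unit circle. The exponential distance kernel, the 1/n scaling, and all parameters are illustrative choices of our own, not taken from the paper.

```python
import numpy as np

def sparse_geometric_graph(n, kernel, scale=5.0, seed=0):
    """Sample a geometric random graph on the unit circle: vertices are
    i.i.d. uniform angles, and edge {i, j} appears independently with
    probability (scale / n) * kernel(distance). The 1/n factor keeps the
    expected degree O(1), i.e., the sparse regime."""
    rng = np.random.default_rng(seed)
    theta = rng.uniform(0.0, 2.0 * np.pi, n)
    d = np.abs(theta[:, None] - theta[None, :])
    d = np.minimum(d, 2.0 * np.pi - d)           # geodesic distance on the circle
    p = np.clip(scale / n * kernel(d), 0.0, 1.0)
    upper = np.triu(rng.random((n, n)) < p, 1)   # sample each pair once
    return theta, upper | upper.T

# Example: connection probability decays with distance.
theta, adj = sparse_geometric_graph(2000, kernel=lambda d: np.exp(-3.0 * d))
print("mean degree:", adj.sum(axis=1).mean())
```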

    Is this correct? Let's check!

    Societal accumulation of knowledge is a complex process. The correctness of new units of knowledge depends not only on the correctness of new reasoning, but also on the correctness of old units that the new one builds on. The errors in such accumulation processes are often remedied by error correction and detection heuristics. Motivating examples include the scientific process based on scientific publications, and software development based on libraries of code. Natural processes that aim to keep errors under control, such as peer review in scientific publications, and testing and debugging in software development, would typically check existing pieces of knowledge -- both for the reasoning that generated them and the previous facts they rely on. In this work, we present a simple process that models such accumulation of knowledge and study the persistence (or lack thereof) of errors. We consider a simple probabilistic model for the generation of new units of knowledge based on the preferential attachment growth model, which additionally allows for errors. Furthermore, the process includes checks aimed at catching these errors. We investigate when effects of errors persist forever in the system (with positive probability) and when they get rooted out completely by the checking process. The two basic parameters associated with the checking process are the probability of conducting a check and the depth of the check. We show that errors are rooted out if checks are sufficiently frequent and sufficiently deep. In contrast, shallow or infrequent checks are insufficient to root out errors.
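
    A process of this shape is easy to simulate. Below is a toy version under simplifying assumptions of our own (the exact check semantics and all parameter values are ours, not the paper's): units attach by preferential attachment, carry faulty reasoning with probability q_err, and each arrival triggers, with probability p_check, a check of itself and up to `depth` ancestors that repairs any faulty reasoning it finds.

```python
import random

def simulate(T=20000, q_err=0.02, p_check=0.3, depth=2, seed=0):
    """Toy sketch of an error-prone knowledge-accumulation process with
    checks. A unit counts as correct at the end iff no unit in its
    ancestry still carries faulty reasoning."""
    rng = random.Random(seed)
    parent = [None]          # unit 0 is a correct axiom
    faulty = [False]
    endpoints = [0]          # uniform pick from this list ~ degree-biased pick
    for t in range(1, T):
        u = rng.choice(endpoints)             # preferential attachment
        parent.append(u)
        faulty.append(rng.random() < q_err)   # new reasoning may be wrong
        endpoints += [u, t]
        if rng.random() < p_check:            # check self and `depth` ancestors
            v, d = t, 0
            while v is not None and d <= depth:
                faulty[v] = False             # error (if any) caught and fixed
                v, d = parent[v], d + 1
    def correct(v):                           # whole ancestry must be fault-free
        while v is not None:
            if faulty[v]:
                return False
            v = parent[v]
        return True
    return sum(correct(v) for v in range(T)) / T

# Compare frequent, deep checks with rare, shallow ones.
print("frequent, deep checks:", simulate(p_check=0.3, depth=2))
print("rare, shallow checks: ", simulate(p_check=0.05, depth=0))
```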

    Integrality gaps for random integer programs via discrepancy

    We give bounds on the additive gap between the value of a random integer program $\max c^\mathsf{T} x,\ Ax \leq b,\ x \in \{0,1\}^n$ with $m$ constraints and that of its linear programming relaxation, for a range of distributions on $(A,b,c)$. Dyer and Frieze (MOR '89) and Borst et al. (IPCO '21) showed that for random packing and Gaussian IPs, respectively, where the entries of $A, c$ are independently distributed according to either the uniform distribution on $[0,1]$ or $\mathcal{N}(0,1)$, the integrality gap is bounded by $O_m(s \log^2 n / n)$ with probability at least $1 - 1/n - e^{-\Omega_m(s)}$ for $s \geq 1$. In this paper, we extend these results to the case where $A$ is discretely distributed (e.g., with entries in $\{-1,0,1\}$), and where the columns of $A$ have log-concave distributions. Second, we improve the success probability from constant, for fixed $s$ and $m$, to $1 - 1/\mathrm{poly}(n)$. Using a connection between integrality gaps and Branch-and-Bound due to Dey, Dubey and Molinaro (SODA '21), our gap results imply that Branch-and-Bound is polynomial for these IPs. Our main technical contribution, and the key to achieving the above results, is a new discrepancy-theoretic theorem which gives general conditions for when a target $t$ is equal or very close to a $\{0,1\}$ combination of the columns of a random matrix $A$. Compared to prior results, our theorem handles a much wider range of distributions on $A$, both continuous and discrete, and achieves success probability exponentially close to $1$, as opposed to constant. We prove this theorem using a Fourier-analytic approach, building on the work of Hoberg and Rothvoss (SODA '19) and Franks and Saks (RSA '20), who studied similar questions for $\{-1,1\}$ combinations.
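
    To make the objects concrete, here is a small numerical experiment in the spirit of the setup (it reproduces the random packing IP, not the paper's analysis): sample an IP with uniform $[0,1]$ entries and compare the LP relaxation with the integer optimum. The scaling $b_i = n/4$, the problem size, and the use of SciPy's HiGHS-based linprog/milp solvers are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import linprog, milp, LinearConstraint, Bounds

def ip_lp_gap(n=60, m=4, seed=0):
    """Sample a random packing IP max c^T x, Ax <= b, x in {0,1}^n with
    uniform [0,1] entries in A and c, and b_i = n/4 (one plausible
    scaling; the cited works study specific regimes of b), then compare
    the LP relaxation value with the true integer optimum."""
    rng = np.random.default_rng(seed)
    A = rng.random((m, n))
    c = rng.random(n)
    b = np.full(m, n / 4)
    # LP relaxation: x in [0,1]^n (linprog minimizes, so negate c).
    lp = linprog(-c, A_ub=A, b_ub=b, bounds=(0, 1))
    # Integer program: x in {0,1}^n, solved by branch-and-bound.
    ip = milp(-c, constraints=LinearConstraint(A, ub=b),
              integrality=np.ones(n), bounds=Bounds(0, 1))
    return -lp.fun, -ip.fun

lp_val, ip_val = ip_lp_gap()
print(f"LP = {lp_val:.4f}, IP = {ip_val:.4f}, gap = {lp_val - ip_val:.4f}")
```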